5 research outputs found

    Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception

    We introduce the Aria Digital Twin (ADT), an egocentric dataset captured using Aria glasses with extensive object-, environment-, and human-level ground truth. This ADT release contains 200 sequences of real-world activities conducted by Aria wearers in two real indoor scenes with 398 object instances (324 stationary and 74 dynamic). Each sequence consists of: a) raw data from two monochrome camera streams, one RGB camera stream, and two IMU streams; b) complete sensor calibration; c) ground truth data including continuous 6-degree-of-freedom (6DoF) poses of the Aria devices, object 6DoF poses, 3D eye gaze vectors, 3D human poses, 2D image segmentations, and image depth maps; and d) photo-realistic synthetic renderings. To the best of our knowledge, no existing egocentric dataset offers a level of accuracy, photo-realism, and comprehensiveness comparable to ADT. By contributing ADT to the research community, our mission is to set a new standard for evaluation in the egocentric machine perception domain, which includes very challenging research problems such as 3D object detection and tracking, scene reconstruction and understanding, sim-to-real learning, and human pose prediction, while also inspiring new machine perception tasks for augmented reality (AR) applications. To kick-start exploration of the ADT research use cases, we evaluated several existing state-of-the-art methods for object detection, segmentation, and image translation tasks that demonstrate the usefulness of ADT as a benchmarking dataset.

    Very High Frame Rate Volumetric Integration of Depth Images on Mobile Devices

    Robust Silhouette Extraction from Kinect

    Natural User Interfaces allow users to interact with virtual environments with little intermediation. Immersion is vital for such interfaces to succeed, and it is achieved by making the interface invisible to the user. For cognitive rehabilitation, a mirror view is a good interface to the virtual world, but obtaining immersion is not straightforward. A player profile, or silhouette, accurately extracted from the real-world background increases both the visual quality and the immersion of the player in the virtual environment. The Kinect SDK provides raw data that can be used to extract a simple player profile. In this paper, we present our method for extracting a smooth player profile from the Kinect image streams.